legacy backfill attempt #2 #632
Conversation
This reverts commit 0ec285c. This is the second time we are trying to backfill with legacy images. The first attempt failed, as detailed here: kubernetes/kubernetes#88553
These images were pushed in the last 24 hours. I expect such changes to continue to happen until gcr.io/google-containers becomes frozen as read-only.
There are 2 commits this time --- the original PR (a revert of #629), as well as a second commit to capture the delta of google-containers between yesterday and today.
What are the untagged ones about?
/lgtm
/approve
/hold
unhold at your discretion.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: listx, thockin
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
They should be the architecture-based images that are not tagged. E.g., https://console.cloud.google.com/gcr/images/google-containers/GLOBAL/debian-iptables@sha256:25c4396386a2d3f2f4785da473d3428bc542a6f774feee830e8b6bc6b053d11b/details?tab=info, where the digest has no tag. I recall a conversation with @detiber and others where it was desirable to not tag individual architecture images and only tag the manifest lists.
/hold cancel
When pushing a manifest-list image (multi-arch), do we have to manually add all of the untagged per-arch images to the promoter manifest, or is it smart enough to drag those in as deps? I have never specified anything but the manifest-list itself on the internal promoter.
…On Thu, Mar 5, 2020 at 9:04 AM Kubernetes Prow Robot wrote:
Merged #632 into master.
It handles the specified dependencies automatically; we are doing this with the Cluster API images today :)
That said, if you want the arch/platform-specific images to be pullable by other tag(s), they would need to be specified separately; if only the manifest-list image is specified, the other images are available only via sha.
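Roughly, a promoter manifest maps each image digest to a (possibly empty) list of tags, so the situation described above might look something like the sketch below. The layout follows the promoter's manifest format as I understand it; the v12.0.1 tag and the aaaa… digest are placeholders, while the long digest is the untagged debian-iptables image linked earlier. Whether the per-arch digests are listed explicitly with empty tag lists or dragged in automatically as dependencies of the manifest list, they end up pullable only by digest.

```yaml
images:
- name: debian-iptables
  dmap:
    # The multi-arch manifest list ("fat manifest") is the only entry carrying a tag (placeholder values).
    "sha256:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa": ["v12.0.1"]
    # Per-arch image referenced by the manifest list: promoted, but only pullable by digest.
    "sha256:25c4396386a2d3f2f4785da473d3428bc542a6f774feee830e8b6bc6b053d11b": []
```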
IMO that's what we want - we should normalize on manifest lists and curtail
per-arch tags.
+1 from me; I just wanted to make sure that I had clarified my earlier statement in case someone assumed otherwise.
So for the purpose of 1:1 backfill, untagged sha256s are OK, but I think we generally want all "living" manifests to only have tagged images, ideally manifest lists. Does that sound right @listx?
Yes. That being said, I want tooling to automatically (and correctly) inject images into the correct promoter manifest, without asking contributors to edit the YAML by hand. Maybe having untagged images recorded in the YAML makes sense (to track everything) --- but we'll cross that bridge when we get there.

I will now post some numbers for the backfill. As expected, the backfill on its own did not finish within the 2hr prow job timeout. We are now sitting at roughly 7k of the 30k images copied in the EU region (US and ASIA show similar numbers) --- for 4 hours worth of running the promoter (this is attempt 2 of 2). If we consider that it's a 3-region thing, we actually copied 21k images in 4 hours, or about ~5k images per hour.

We have about 90k - 21k or ~70k images left to copy. This means we have to run the promoter at least another ~13hrs, or about 6 or 7 more 2-hr invocations. We are now running the promoter every 4 hours, so it will take about another 24 hours or so, assuming no additional PRs, before the backfill is complete.

Meanwhile, I have not observed any …
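Spelling out the arithmetic above (all figures approximate, taken straight from the numbers in the comment):

```latex
\begin{aligned}
\text{copied so far} &\approx 7\text{k (EU)} \times 3\ \text{regions} = 21\text{k} \\
\text{throughput}    &\approx 21\text{k} / 4\,\text{h} \approx 5\text{k images/h} \\
\text{remaining}     &\approx 30\text{k} \times 3 - 21\text{k} \approx 70\text{k} \\
\text{time left}     &\approx 70\text{k} / 5\text{k/h} \approx 13\text{--}14\,\text{h},\ \text{i.e. 6--7 more 2-h runs}
\end{aligned}
```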
We are trying to run the backfill [1] mostly continuously to minimize the promoter's downtime for new promotions. Even without the backfill, we would still want this to run more frequently as a sanity check anyway. [1]: kubernetes/k8s.io#632
Now that kubernetes/test-infra#16640 has merged, the periodic job is running every hour. It is running right now. The cool thing I noticed is that Cloud Run is autoscaling automatically to handle all requests. I've done some grepping through the logs from the UI (because …
To get a live view of the backfill doing its thing, check out https://prow.k8s.io/?job=ci-k8sio-cip. I expect these jobs to continuously fail for the next several hours going into the evening (each picking up where the last one left off), until the backfill is complete.
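For context, that hourly run is a Prow periodic job; the real definition is the one merged in kubernetes/test-infra#16640. The sketch below only illustrates the general shape of such a job --- the image and command are placeholders, not the actual values:

```yaml
periodics:
- name: ci-k8sio-cip
  interval: 1h          # re-run the promoter every hour so the backfill (and new promotions) keep moving
  decorate: true
  spec:
    containers:
    - image: <promoter-image>   # placeholder; see the real job definition for the image and flags
      command:
      - <cip invocation against the k8s.io promoter manifests>   # placeholder
```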
Strange, I have detected an invalid push of an image:
Digging deeper, this should not have raised an alarm, because the image is part of a valid promotion here: #633. This is a case of a fat manifest (manifest list). I'll work on fixing this bug tomorrow. The good news is that we can still manually verify all images after the backfill is done, by comparing the totality of images in the prod repos vs. what we have defined in the promoter manifests.
OK so the backfill finished with this Prow job: https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-k8sio-cip/1235800257359515648 at 2020-03-06 07:12:01 UTC (basically ~11PM PST on March 5). Ever since then, the same job (which runs periodically: https://prow.k8s.io/?job=ci-k8sio-cip) has finished successfully.

I have noticed a bug with the auditor in parsing certain references. I will work on fixing this today. After this is fixed, I have to: (1) promote this image to prod …

Meanwhile, I'm doing a manual verification of what is in GCR (k8s.artifacts.prod) vs. what we have defined in our promoter manifests.
I've finished manually verifying the images. If I run the following script from the promoter's root repo:
I essentially get identical outputs for all 6 snapshots (3 from the 3 GCR regions, the other 3 from the aggregated promoter manifest YAMLs on disk). I've opened a PR to make the promoter manifests match what's in production: #642. NOTE: It's important to pass in the …
I'm still working on fixing kubernetes-sigs/promo-tools#191. It will take a little effort, but I should be able to fix it today.
Thanks for the update @listx! :-)
I was overly optimistic about this; a more likely timeline is today/tomorrow.
Update on #191 here: kubernetes-sigs/promo-tools#191 (comment)
/cc @thockin @dims